Skip to content

Updating the FieldMappers to insert call addField of DocumentInput#21001

Open
darjisagar7 wants to merge 1 commit intoopensearch-project:mainfrom
darjisagar7:DocumentInput
Open

Updating the FieldMappers to insert call addField of DocumentInput#21001
darjisagar7 wants to merge 1 commit intoopensearch-project:mainfrom
darjisagar7:DocumentInput

Conversation

@darjisagar7
Copy link
Copy Markdown
Contributor

@darjisagar7 darjisagar7 commented Mar 26, 2026

Description

This PR adds the following functionalities:

  1. Introducing a new Feature Flag for the Pluggable dataformat feature. [RFC] OpenSearch Engine: Pluggable Component Bundling #20644
  2. It adds a new IndexSettings named "index.pluggable.dataformat.enabled", which can be used along with feature flag to enable the pluggable dataformat for an Index.
  3. This PR also ensures that the FieldMapper are decoupled from the Lucene's code. Based on the pluggable dataformat architecture the data will be added to DocumentInput which will add the data to respective dataformat.

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 7f26163: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 80f72cd.

PathLineSeverityDescription
modules/parent-join/src/main/java/org/opensearch/join/mapper/ParentJoinFieldMapper.java430lowThe `context.path().remove()` call is omitted from the pluggable-dataformat branch. In the original code this cleanup was always executed after adding fields, but in the refactored block it only runs in the legacy `else` path. When the feature is enabled, the path state is not cleaned up, which could cause subtle document-parsing state corruption. This appears to be an unintentional omission rather than malicious, but is worth investigation.

The table above displays the top 10 most important findings.

Total: 1 | Critical: 0 | High: 0 | Medium: 0 | Low: 1


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for 93426fa: SUCCESS

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 26, 2026

Codecov Report

❌ Patch coverage is 81.34328% with 25 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.20%. Comparing base (15fcc08) to head (216d68a).
⚠️ Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
...h/index/mapper/ICUCollationKeywordFieldMapper.java 55.55% 1 Missing and 3 partials ⚠️
.../opensearch/index/mapper/size/SizeFieldMapper.java 0.00% 3 Missing ⚠️
.../org/opensearch/index/mapper/RangeFieldMapper.java 40.00% 1 Missing and 2 partials ⚠️
...earch/index/mapper/SearchAsYouTypeFieldMapper.java 75.00% 1 Missing and 1 partial ⚠️
...rg/opensearch/join/mapper/ParentIdFieldMapper.java 60.00% 1 Missing and 1 partial ⚠️
.../opensearch/join/mapper/ParentJoinFieldMapper.java 60.00% 1 Missing and 1 partial ⚠️
.../java/org/opensearch/index/mapper/FieldMapper.java 60.00% 1 Missing and 1 partial ⚠️
...pensearch/index/mapper/ScaledFloatFieldMapper.java 87.50% 0 Missing and 1 partial ⚠️
...rg/opensearch/index/mapper/BooleanFieldMapper.java 88.88% 0 Missing and 1 partial ⚠️
...a/org/opensearch/index/mapper/DateFieldMapper.java 91.66% 0 Missing and 1 partial ⚠️
... and 4 more
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #21001      +/-   ##
============================================
- Coverage     73.26%   73.20%   -0.06%     
- Complexity    72743    72781      +38     
============================================
  Files          5862     5862              
  Lines        332558   332608      +50     
  Branches      48010    48027      +17     
============================================
- Hits         243643   243500     -143     
- Misses        69343    69566     +223     
+ Partials      19572    19542      -30     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for 2fc592f: SUCCESS

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 30, 2026

PR Reviewer Guide 🔍

(Review updated until commit 216d68a)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Add PLUGGABLE_DATAFORMAT feature flag and index setting infrastructure

Relevant files:

  • server/src/main/java/org/opensearch/common/util/FeatureFlags.java
  • server/src/main/java/org/opensearch/common/settings/FeatureFlagSettings.java
  • server/src/main/java/org/opensearch/index/IndexSettings.java
  • server/src/main/java/org/opensearch/common/settings/IndexScopedSettings.java
  • server/src/main/java/org/opensearch/index/mapper/Mapper.java

Sub-PR theme: Integrate DocumentInput into DocumentParser and ParseContext plumbing

Relevant files:

  • server/src/main/java/org/opensearch/index/mapper/DocumentMapper.java
  • server/src/main/java/org/opensearch/index/mapper/DocumentParser.java
  • server/src/main/java/org/opensearch/index/mapper/ParseContext.java
  • server/src/main/java/org/opensearch/index/mapper/ParsedDocument.java
  • server/src/main/java/org/opensearch/index/mapper/FieldMapper.java
  • server/src/test/java/org/opensearch/index/mapper/DocumentMapperTests.java
  • server/src/test/java/org/opensearch/index/mapper/DocumentParserTests.java
  • test/framework/src/main/java/org/opensearch/index/mapper/MapperServiceTestCase.java

Sub-PR theme: Update all FieldMappers to call DocumentInput.addField when feature enabled

Relevant files:

  • server/src/main/java/org/opensearch/index/mapper/BinaryFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/BooleanFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/DateFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/IpFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/KeywordFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/NumberFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/RangeFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/TextFieldMapper.java
  • modules/mapper-extras/src/main/java/org/opensearch/index/mapper/ScaledFloatFieldMapper.java
  • modules/mapper-extras/src/main/java/org/opensearch/index/mapper/SearchAsYouTypeFieldMapper.java
  • modules/mapper-extras/src/main/java/org/opensearch/index/mapper/TokenCountFieldMapper.java
  • modules/parent-join/src/main/java/org/opensearch/join/mapper/ParentIdFieldMapper.java
  • modules/parent-join/src/main/java/org/opensearch/join/mapper/ParentJoinFieldMapper.java
  • plugins/analysis-icu/src/main/java/org/opensearch/index/mapper/ICUCollationKeywordFieldMapper.java
  • plugins/mapper-size/src/main/java/org/opensearch/index/mapper/size/SizeFieldMapper.java
  • server/src/test/java/org/opensearch/index/mapper/BinaryFieldMapperTests.java
  • server/src/test/java/org/opensearch/index/mapper/BooleanFieldMapperTests.java
  • server/src/test/java/org/opensearch/index/mapper/DateFieldMapperTests.java
  • server/src/test/java/org/opensearch/index/mapper/IpFieldMapperTests.java
  • server/src/test/java/org/opensearch/index/mapper/KeywordFieldMapperTests.java
  • server/src/test/java/org/opensearch/index/mapper/NumberFieldMapperTests.java
  • server/src/test/java/org/opensearch/index/mapper/TextFieldMapperTests.java
  • modules/mapper-extras/src/test/java/org/opensearch/index/mapper/ScaledFloatFieldMapperTests.java
  • modules/mapper-extras/src/test/java/org/opensearch/index/mapper/SearchAsYouTypeFieldMapperTests.java
  • modules/mapper-extras/src/test/java/org/opensearch/index/mapper/TokenCountFieldMapperTests.java

⚡ Recommended focus areas for review

Unused Field

An AtomicBoolean field isPluggableDataFormatFeatureEnabled is declared as an instance variable but never used. The actual check is done via the method isPluggableDataFormatFeatureEnabled(ParseContext) which delegates to isPluggableDataFormatEnabled(Settings). This unused field should be removed to avoid confusion.

private final AtomicBoolean isPluggableDataFormatFeatureEnabled = new AtomicBoolean(false);
Null documentInput

The documentInput() method can return null when no DocumentInput is provided. All callers that invoke context.documentInput().addField(...) only do so after checking isPluggableDataFormatFeatureEnabled(context), but if the feature flag is enabled and documentInput is null (e.g., called via the old parse(SourceToParse) path), a NullPointerException will occur. There is no null guard before calling addField.

@Override
public DocumentInput documentInput() {
    return this.documentInput;
}
Misleading Javadoc

The Javadoc for isPluggableDataFormatEnabled says "Requires both the PLUGGABLE_DATAFORMAT_EXPERIMENTAL_FLAG feature flag" but the sentence is incomplete — it doesn't mention the second requirement (the index-level setting). The comment should be updated to clearly state both conditions required.

/**
 * Checks if the optimised index feature is enabled for the given settings.
 * Requires both the {@link FeatureFlags#PLUGGABLE_DATAFORMAT_EXPERIMENTAL_FLAG} feature flag
 *
 * @param settings the index settings to check
 * @return {@code true} if the pluggable dataformat feature flag and the optimised index setting are both enabled
 */
public static boolean isPluggableDataFormatEnabled(Settings settings) {
    return FeatureFlags.isEnabled(FeatureFlags.PLUGGABLE_DATAFORMAT_EXPERIMENTAL_FLAG)
        && PLUGGABLE_DATAFORMAT_ENABLED_SETTING.get(settings);
}
Feature Flag + Null Risk

When isPluggableDataFormatFeatureEnabled returns true but documentInput is null (e.g., the feature flag is enabled at node level but the caller uses the old parse(SourceToParse) API), all field mappers will call context.documentInput().addField(...) and throw a NullPointerException. There should be a validation or guard in parseDocument to ensure documentInput is non-null when the feature is enabled.

ParsedDocument parseDocument(SourceToParse source, MetadataFieldMapper[] metadataFieldsMappers, DocumentInput documentInput)
    throws MapperParsingException {
    final Mapping mapping = docMapper.mapping();
    final ParseContext.InternalParseContext context;
    final MediaType mediaType = source.getMediaType();

    try (
        XContentParser parser = XContentHelper.createParser(
            docMapperParser.getXContentRegistry(),
            LoggingDeprecationHandler.INSTANCE,
            source.source(),
            mediaType
        )
    ) {
        context = new ParseContext.InternalParseContext(indexSettings, docMapperParser, docMapper, source, parser, documentInput);
Settings Duplication

In createMapperService(Settings settings, XContentBuilder mapping), the IndexMetadata is built with both settings merged in and then IndexSettings is also constructed with the same settings. This could lead to settings being applied twice or inconsistently. The existing createMapperService(Version, XContentBuilder) pattern should be reviewed to ensure the new overload is consistent.

protected final MapperService createMapperService(Settings settings, XContentBuilder mapping) throws IOException {
    IndexMetadata meta = IndexMetadata.builder("index")
        .settings(Settings.builder().put("index.version.created", Version.CURRENT).put(settings))
        .numberOfReplicas(0)
        .numberOfShards(1)
        .build();
    IndexSettings indexSettings = new IndexSettings(meta, settings);
    MapperRegistry mapperRegistry = new IndicesModule(
        getPlugins().stream().filter(p -> p instanceof MapperPlugin).map(p -> (MapperPlugin) p).collect(toList())
    ).getMapperRegistry();
    ScriptModule scriptModule = new ScriptModule(
        Settings.EMPTY,
        getPlugins().stream().filter(p -> p instanceof ScriptPlugin).map(p -> (ScriptPlugin) p).collect(toList())
    );
    ScriptService scriptService = new ScriptService(settings, scriptModule.engines, scriptModule.contexts);
    SimilarityService similarityService = new SimilarityService(indexSettings, scriptService, emptyMap());
    MapperService mapperService = new MapperService(
        indexSettings,
        createIndexAnalyzers(indexSettings),
        xContentRegistry(),
        similarityService,
        mapperRegistry,
        () -> {
            throw new UnsupportedOperationException();
        },
        () -> true,
        scriptService
    );
    merge(mapperService, mapping);
    return mapperService;
}

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 30, 2026

PR Code Suggestions ✨

Latest suggestions up to 216d68a

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Guard against null DocumentInput to prevent NPE

The documentInput() method can return null when no DocumentInput is provided (e.g.,
in the default parseDocument path). All callers that invoke
context.documentInput().addField(...) inside isPluggableDataFormatFeatureEnabled
branches will throw a NullPointerException if documentInput is null. Consider adding
a null-check guard or returning a no-op DocumentInput instead of null.

server/src/main/java/org/opensearch/index/mapper/ParseContext.java [778-779]

 @ExperimentalApi
 public abstract DocumentInput<?> documentInput();
 
+// In InternalParseContext.documentInput():
+@Override
+public DocumentInput documentInput() {
+    return this.documentInput != null ? this.documentInput : NoOpDocumentInput.INSTANCE;
+}
+
Suggestion importance[1-10]: 7

__

Why: When documentInput is null (default parse path), calling context.documentInput().addField(...) inside isPluggableDataFormatFeatureEnabled branches would throw a NullPointerException. However, the isPluggableDataFormatFeatureEnabled check should only be true when a DocumentInput is provided, so this may be a design-level concern rather than an immediate bug. The improved_code doesn't fully reflect the suggestion (it shows the abstract method unchanged), reducing its score.

Medium
General
Include sub-fields in pluggable dataformat path

In the pluggable dataformat path, only the root field's value is passed to
documentInput().addField(...), but the shingle sub-fields and prefix field are
silently skipped. If the DocumentInput consumer needs all sub-field values (e.g.,
for full-text search support), they should also be added. At minimum, this
behavioral difference should be documented or the sub-fields should be added as
well.

modules/mapper-extras/src/main/java/org/opensearch/index/mapper/SearchAsYouTypeFieldMapper.java [658-669]

 if (isPluggableDataFormatFeatureEnabled(context)) {
     context.documentInput().addField(fieldType(), value);
+    for (ShingleFieldMapper subFieldMapper : shingleFields) {
+        context.documentInput().addField(subFieldMapper.fieldType(), value);
+    }
+    context.documentInput().addField(prefixField.fieldType(), value);
 } else {
     context.doc().add(new Field(fieldType().name(), value, fieldType().fieldType));
     for (ShingleFieldMapper subFieldMapper : shingleFields) {
         context.doc().add(new Field(subFieldMapper.fieldType().name(), value, subFieldMapper.getLuceneFieldType()));
     }
     context.doc().add(new Field(prefixField.fieldType().name(), value, prefixField.getLuceneFieldType()));
     if (fieldType().fieldType.omitNorms()) {
         createFieldNamesField(context);
     }
 }
Suggestion importance[1-10]: 6

__

Why: In the pluggable dataformat path, shingle sub-fields and the prefix field are skipped, creating an asymmetry with the non-pluggable path. If the DocumentInput consumer needs all sub-field values for correct behavior, this omission could cause functional issues. The suggestion is valid and the improved_code accurately reflects the change.

Low
Remove unused AtomicBoolean field in FieldMapper

The AtomicBoolean isPluggableDataFormatFeatureEnabled field is declared but never
read or written — the actual check is delegated to the static
isPluggableDataFormatEnabled(Settings) method in Mapper. This unused field adds
confusion and unnecessary memory overhead per FieldMapper instance. It should be
removed.

server/src/main/java/org/opensearch/index/mapper/FieldMapper.java [221]

-private final AtomicBoolean isPluggableDataFormatFeatureEnabled = new AtomicBoolean(false);
+// Remove the unused field entirely:
+// private final AtomicBoolean isPluggableDataFormatFeatureEnabled = new AtomicBoolean(false);
Suggestion importance[1-10]: 5

__

Why: The isPluggableDataFormatFeatureEnabled AtomicBoolean field is declared but never used — the actual check delegates to Mapper.isPluggableDataFormatEnabled(Settings). This is dead code that adds confusion and per-instance memory overhead.

Low
Ensure feature flag settings are passed as node settings

The IndexSettings constructor is called with the custom settings as the node
settings argument, but node-level settings (like the
PLUGGABLE_DATAFORMAT_EXPERIMENTAL_FLAG feature flag) should be passed as node
settings, not index settings. This means the feature flag check via
FeatureFlags.isEnabled(...) (which reads node settings) may not work correctly in
tests. The node settings should include the feature flag setting separately.

server/src/test/java/org/opensearch/index/mapper/MapperServiceTestCase.java [188-194]

 protected final MapperService createMapperService(Settings settings, XContentBuilder mapping) throws IOException {
     IndexMetadata meta = IndexMetadata.builder("index")
         .settings(Settings.builder().put("index.version.created", Version.CURRENT).put(settings))
         .numberOfReplicas(0)
         .numberOfShards(1)
         .build();
+    // Pass settings as both index and node settings so feature flags are visible
     IndexSettings indexSettings = new IndexSettings(meta, settings);
Suggestion importance[1-10]: 4

__

Why: The improved_code is essentially identical to the existing_code with only a comment added, making the suggestion unclear in its implementation. The concern about node vs. index settings for feature flags is valid but the suggestion doesn't provide a concrete fix.

Low

Previous suggestions

Suggestions up to commit 6036bb4
CategorySuggestion                                                                                                                                    Impact
General
Separate index settings from node settings in test helper

The IndexSettings constructor is called with settings as the node settings, but
settings here contains index-scoped settings (e.g.,
index.pluggable.dataformat.enabled). Node settings and index settings are separate
concerns; passing index settings as node settings may cause unexpected behavior or
settings validation failures. Node settings should be Settings.EMPTY or a separate
node-level settings object.

test/framework/src/main/java/org/opensearch/index/mapper/MapperServiceTestCase.java [188-194]

 protected final MapperService createMapperService(Settings settings, XContentBuilder mapping) throws IOException {
     IndexMetadata meta = IndexMetadata.builder("index")
         .settings(Settings.builder().put("index.version.created", Version.CURRENT).put(settings))
         .numberOfReplicas(0)
         .numberOfShards(1)
         .build();
-    IndexSettings indexSettings = new IndexSettings(meta, settings);
+    IndexSettings indexSettings = new IndexSettings(meta, Settings.EMPTY);
Suggestion importance[1-10]: 6

__

Why: Passing index-scoped settings as node settings to IndexSettings constructor is semantically incorrect and could cause settings validation issues. Using Settings.EMPTY for node settings is the correct approach, as shown in the existing createMapperService(Version, XContentBuilder) method pattern.

Low
Remove or properly implement unused cached field

The isPluggableDataFormatFeatureEnabled field is declared as volatile Boolean but is
initialized to null and never actually written to after construction. The
isPluggableDataFormatFeatureEnabled(ParseContext) method always calls
isOptimisedIndexEnabled directly without using this field, making it dead code.
Either remove the field or implement the intended caching logic to avoid confusion.

server/src/main/java/org/opensearch/index/mapper/FieldMapper.java [220]

-protected volatile Boolean isPluggableDataFormatFeatureEnabled;
+// Remove the unused field, or implement caching:
+// private volatile Boolean isPluggableDataFormatFeatureEnabled;
Suggestion importance[1-10]: 5

__

Why: The isPluggableDataFormatFeatureEnabled field is declared and initialized to null but never read or written after construction, making it dead code. The actual method isPluggableDataFormatFeatureEnabled(ParseContext) bypasses it entirely, so the field serves no purpose and adds confusion.

Low
Fix wildcard return type inconsistency on documentInput

The abstract documentInput() method returns DocumentInput<?> with a wildcard, but the
concrete implementations in InternalParseContext and FilterParseContext store and
return a raw DocumentInput. Callers that invoke
context.documentInput().addField(...) will get an unchecked call on the wildcard
type. The return type should be consistent (e.g., DocumentInput without wildcard, or
a bounded type) to avoid potential ClassCastException at runtime.

server/src/main/java/org/opensearch/index/mapper/ParseContext.java [778-779]

 @ExperimentalApi
-public abstract DocumentInput<?> documentInput();
+public abstract DocumentInput documentInput();
Suggestion importance[1-10]: 3

__

Why: The wildcard DocumentInput<?> return type is a minor type-safety concern, but callers using addField on the raw wildcard won't cause ClassCastException since addField takes Object. The practical impact is low, and the suggestion's improved_code uses a raw type which is also not ideal.

Low
Possible issue
Ensure DocumentInput is closed after parsing

The documentInput passed into parsedDocument is the same instance provided by the
caller, but it is never closed after parsing completes (even on exception). If
DocumentInput holds resources (e.g., buffers), they will leak. The documentInput
should be closed in a finally block or via try-with-resources after parsing is done.

server/src/main/java/org/opensearch/index/mapper/DocumentParser.java [117]

+try {
+    context = new ParseContext.InternalParseContext(indexSettings, docMapperParser, docMapper, source, parser, documentInput);
+    validateStart(parser);
+    internalParseDocument(mapping, metadataFieldsMappers, context, parser);
+    validateEnd(parser);
+} catch (Exception e) {
+    // existing exception handling
+}
+// ... existing path check and postParse ...
 return parsedDocument(source, context, createDynamicUpdate(mapping, docMapper, context.getDynamicMappers()), documentInput);
+// Note: close documentInput in a finally block wrapping the entire method body
Suggestion importance[1-10]: 4

__

Why: If DocumentInput holds resources, not closing it could cause leaks. However, the improved_code doesn't actually show the finally block closing the documentInput, making the suggestion incomplete. The concern is valid but the implementation guidance is insufficient.

Low
Suggestions up to commit 6101886
CategorySuggestion                                                                                                                                    Impact
Possible issue
Avoid stale cache across different index settings

The cached value is computed from parseContext.indexSettings(), but FieldMapper
instances are reused across multiple parse operations that may have different index
settings. Caching this on the instance can cause stale results if the same mapper is
used with different settings. The check should be performed directly without
instance-level caching.

server/src/main/java/org/opensearch/index/mapper/FieldMapper.java [380-387]

 protected final boolean isPluggableDataFormatFeatureEnabled(ParseContext parseContext) {
-    if (isPluggableDataFormatFeatureEnabled == -1) {
-        boolean featureFlagEnabled = isOptimisedIndexEnabled(parseContext.indexSettings().getSettings());
-        isPluggableDataFormatFeatureEnabled = featureFlagEnabled ? 1 : 0;
-    }
-
-    return isPluggableDataFormatFeatureEnabled == 1;
+    return isOptimisedIndexEnabled(parseContext.indexSettings().getSettings());
 }
Suggestion importance[1-10]: 8

__

Why: FieldMapper instances can be reused across parse operations with different IndexSettings, so caching isPluggableDataFormatFeatureEnabled on the instance can return stale results. Removing the caching and computing the value directly from parseContext.indexSettings() each time is safer and correct.

Medium
Pass custom settings into index metadata builder

The IndexMetadata is built with a hardcoded settings builder that ignores the
provided settings parameter for the metadata itself. Index-scoped settings like
index.pluggable.dataformat.enabled need to be present in the IndexMetadata settings
to be picked up by IndexSettings. The provided settings should be merged into the
metadata settings.

test/framework/src/main/java/org/opensearch/index/mapper/MapperServiceTestCase.java [188-194]

 protected final MapperService createMapperService(Settings settings, XContentBuilder mapping) throws IOException {
     IndexMetadata meta = IndexMetadata.builder("index")
-        .settings(Settings.builder().put("index.version.created", Version.CURRENT))
+        .settings(Settings.builder().put("index.version.created", Version.CURRENT).put(settings))
         .numberOfReplicas(0)
         .numberOfShards(1)
         .build();
     IndexSettings indexSettings = new IndexSettings(meta, settings);
Suggestion importance[1-10]: 8

__

Why: The IndexMetadata is built ignoring the provided settings, so index-scoped settings like index.pluggable.dataformat.enabled won't be present in the metadata and won't be picked up by IndexSettings. Merging the provided settings into the metadata builder is necessary for the pluggable dataformat tests to work correctly.

Medium
Fix thread-safety of cached feature flag field

Using an int field as a tri-state cache for a boolean value is fragile and
non-idiomatic. Additionally, FieldMapper instances can be shared across threads,
making this mutable instance field a thread-safety concern. Consider using a
volatile Boolean (nullable) or computing the value without caching it on the
instance, since ParseContext is already available at call time.

server/src/main/java/org/opensearch/index/mapper/FieldMapper.java [220-233]

-protected int isPluggableDataFormatFeatureEnabled;
+protected volatile Boolean isPluggableDataFormatFeatureEnabled;
 ...
-this.isPluggableDataFormatFeatureEnabled = -1;
+this.isPluggableDataFormatFeatureEnabled = null;
Suggestion importance[1-10]: 7

__

Why: Using int as a tri-state cache for a boolean is non-idiomatic and the mutable instance field isPluggableDataFormatFeatureEnabled on a potentially shared FieldMapper instance raises thread-safety concerns. Switching to volatile Boolean (nullable) is a cleaner and safer approach.

Medium
General
Add generic type parameter to avoid raw type usage

The documentInput() method returns a raw DocumentInput type without a generic type
parameter, which loses type safety. It should use a wildcard or a consistent generic
bound to match the typed usages in the codebase and avoid unchecked cast warnings at
call sites.

server/src/main/java/org/opensearch/index/mapper/ParseContext.java [778-779]

 @ExperimentalApi
-public abstract DocumentInput documentInput();
+public abstract DocumentInput<?> documentInput();
Suggestion importance[1-10]: 4

__

Why: The raw DocumentInput return type loses type safety and may cause unchecked cast warnings at call sites. Using DocumentInput<?> is a minor but worthwhile improvement for type safety.

Low

@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for 6101886: SUCCESS

* @param settings the index settings to check
* @return {@code true} if the pluggable dataformat feature flag and the optimised index setting are both enabled
*/
public static boolean isOptimisedIndexEnabled(Settings settings) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename this to isPluggableDataFormatEnabled?

@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 6036bb4

protected MultiFields multiFields;
protected CopyTo copyTo;
protected DerivedFieldGenerator derivedFieldGenerator;
protected int isPluggableDataFormatFeatureEnabled;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should follow an AtomicReference#compareAndSet

@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for 6036bb4: SUCCESS

1. Introducing FeatureFlag and Index Setting for pluggable dataformat feature.
2. Updating the FieldMappers to insert fields in DocumentInput for Multi Format Engine

Signed-off-by: Sagar Darji <darsaga@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 216d68a

@darjisagar7 darjisagar7 marked this pull request as ready for review March 31, 2026 08:17
@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for 216d68a: SUCCESS

@mgodwan
Copy link
Copy Markdown
Member

mgodwan commented Apr 1, 2026

  1. For mapper plugins, we need to provide a separate way to allow them to provide how those fields should be written in a data format.
  2. Wiring of documentInput into ParseContext would come from the engine, right i.e. prepareIndex?

Copy link
Copy Markdown
Contributor

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to have a clean branching using something like this in FieldMapper

public void parse(ParseContext context) throws IOException {
    try {
        if (isPluggableDataFormatFeatureEnabled(context)) {
            parseCreateFieldPluggable(context);
        } else {
            parseCreateField(context);
        }
        extractGroupingCriteriaParams(context);
    } catch (Exception e) {
        // ... existing error handling
    }
    multiFields.parse(this, context);
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants